 East Kalimantan


Personal Intelligence System UniLM: Hybrid On-Device Small Language Model and Server-Based Large Language Model for Malay Nusantara

arXiv.org Artificial Intelligence

In contexts with limited computational and data resources, high-resource language models often prove inadequate, particularly when addressing the specific needs of Malay languages. This paper introduces a Personal Intelligence System designed to efficiently integrate both on-device and server-based models. The system incorporates SLiM-34M for on-device processing, optimized for low memory and power usage, and MANYAK-1.3B for server-based tasks, allowing for scalable, high-performance language processing. The models achieve significant results across various tasks, such as machine translation, question answering, and translated IndoMMLU. Notably, SLiM-34M achieves a substantial improvement in accuracy over other LLMs while using half as many pre-training tokens. This work challenges the prevailing assumption that large-scale computational resources are necessary to build effective language models, and it contributes to the development of resource-efficient models for the Malay language through the orchestration between SLiM-34M and MANYAK-1.3B.


12 futuristic cities being built around the world, from Saudi Arabia to China

#artificialintelligence

With the world's population continuing to increase and climate change drastically affecting our environment, many metropolises are struggling to grow, develop and even support citizens within current and traditional urban designs. Governments, entrepreneurs and technology companies are employing some of the world's leading architects and designers to rethink the idea of cities, how people can interact and how to live within them. From reclaimed land, groundbreaking skyscrapers in the desert and cities rising in the metaverse, here are 12 incredible futuristic cities redefining the urban spaces we live in. The $500 billion Neom project in Saudi Arabia is set to be home to a record-setting 170-kilometre-long skyscraper called the Mirror Line. It will be the world's largest structure, comprising two buildings up to 490 metres tall, running parallel to each other.


Explain It To Me : Confusion Matrix

#artificialintelligence

You can refer to the documentation if you want to learn more. Through this article you've learned about: I hope you have gained a basic understanding of the confusion matrix and the important metrics for classification tasks. Remember, never stop learning and stay awesome!
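As a rough illustration of what the excerpt refers to, the sketch below builds a binary confusion matrix and derives a few metrics from it. The excerpt does not say which metrics the article covers, so accuracy, precision, recall and F1 are an assumption here, and the function names and sample labels are hypothetical.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count (tp, fp, fn, tn) for a binary task; `positive` marks the positive class."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1  # predicted positive, actually positive
        elif pred == positive:
            fp += 1  # predicted positive, actually negative
        elif truth == positive:
            fn += 1  # predicted negative, actually positive
        else:
            tn += 1  # predicted negative, actually negative
    return tp, fp, fn, tn


def classification_metrics(tp, fp, fn, tn):
    """Derive common metrics from the four counts, guarding against empty denominators."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical labels, chosen only for illustration.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

counts = confusion_counts(y_true, y_pred)
metrics = classification_metrics(*counts)
print(counts)
print(metrics)
```

Here precision asks how many predicted positives were correct, while recall asks how many actual positives were found; which one matters more depends on the cost of false positives versus false negatives.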


ERICA: Improving Entity and Relation Understanding for Pre-trained Language Models via Contrastive Learning

arXiv.org Artificial Intelligence

Pre-trained Language Models (PLMs) have shown strong performance in various downstream Natural Language Processing (NLP) tasks. However, PLMs still do not capture the factual knowledge in text well, even though this knowledge is crucial for understanding the whole text, especially in document-level language understanding tasks. To address this issue, we propose a novel contrastive learning framework named ERICA, applied in the pre-training phase, to obtain a deeper understanding of the entities and their relations in text. Specifically, (1) to better understand entities, we propose an entity discrimination task that distinguishes which tail entity can be inferred from the given head entity and relation; (2) to better understand relations, we employ a relation discrimination task that distinguishes whether two entity pairs are close or not in relational semantics. Experimental results demonstrate that our proposed ERICA framework achieves consistent improvements on several document-level language understanding tasks, including relation extraction and reading comprehension, especially under low-resource settings. Meanwhile, ERICA achieves comparable or better performance on sentence-level tasks. We will release the datasets, source code and pre-trained language models for further research.
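The abstract does not give the exact loss, so the following is only a generic InfoNCE-style contrastive objective, sketched under the assumption that each entity pair is already encoded as a non-zero relation vector. The function names, the temperature value and the toy vectors are all hypothetical and are not taken from ERICA.

```python
import math


def cosine(u, v):
    """Cosine similarity of two non-zero vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(anchor, positive, negatives, temperature=0.1):
    """Generic InfoNCE-style loss: low when the anchor is closer to the
    positive than to the negatives (an assumed stand-in, not ERICA's loss)."""
    pos = math.exp(cosine(anchor, positive) / temperature)
    neg = sum(math.exp(cosine(anchor, n) / temperature) for n in negatives)
    return -math.log(pos / (pos + neg))


# Toy relation vectors for entity pairs (hypothetical values).
anchor = [1.0, 0.0]          # entity pair expressing some relation
same_relation = [0.9, 0.1]   # another pair assumed to share that relation
other_relations = [[0.0, 1.0], [-1.0, 0.0]]

loss_aligned = info_nce(anchor, same_relation, other_relations)
loss_misaligned = info_nce(anchor, [0.0, 1.0], [[0.9, 0.1], [-1.0, 0.0]])
print(loss_aligned, loss_misaligned)
```

Minimizing a loss of this shape pulls representations of relationally similar entity pairs together and pushes dissimilar ones apart, which is the general idea behind a discrimination task of this kind.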